[axonius][identity] Add Axonius Identity datastream#16620
[axonius][identity] Add Axonius Identity datastream#16620muskan-agarwal26 wants to merge 6 commits intoelastic:feature/axonius-0.1.0from
Conversation
|
Package microsoft_exchange_online_message_trace - 1.29.1 containing this change is available at https://epr.elastic.co/package/microsoft_exchange_online_message_trace/1.29.1/ |
|
Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as |
|
Hi! This PR has been stale for a while and we're going to close it as part of our cleanup procedure. We appreciate your contribution and would like to apologize if we have not been able to review it, due to the current heavy load of the team. Feel free to re-open this PR if you think it should stay open and is worth rebasing. Thank you for your contribution! |
|
Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as |
…ntegrations into datastream-identity
| : | ||
| string(resp.Status) + " (" + string(resp.StatusCode) + ")" | ||
| ), | ||
| "asset_type": string(state.worklist.asset_type_list[0]), |
There was a problem hiding this comment.
Why does this need a string conversion?
There was a problem hiding this comment.
No need, removing string conversion for this from everywhere in code.
| on_failure: | ||
| - append: | ||
| field: error.message | ||
| value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.on_failure_pipeline}}} failed with message: {{{_ingest.on_failure_message}}}' |
There was a problem hiding this comment.
Check that all date and convert processors will remove the field value on failure. This field is defined as a date so if it fails we have left an unmappable value in place.
There was a problem hiding this comment.
Can this be more concisely/manageably expressed with some/most of the processors replaced with a script
| - version: 0.1.1 | ||
| changes: | ||
| - description: Add support for identity data stream. | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/16620 |
There was a problem hiding this comment.
This should be in the 0.1.1 changes. This is not being merged into main in this PR.
| name: axonius | ||
| title: Axonius | ||
| version: 0.1.0 | ||
| version: 0.1.1 |
There was a problem hiding this comment.
| version: 0.1.1 | |
| version: 0.1.0 |
| "id": "axonius-60472232-ca7b-45e6-9fa6-72e6efc41a8e", | ||
| "references": [ | ||
| { | ||
| "id": "metrics-*", |
| "type": "index-pattern" | ||
| }, | ||
| { | ||
| "id": "metrics-*", |
There was a problem hiding this comment.
logs? Also, duplicate of the index-pattern at L1275.
| { | ||
| "id": "logs-*", | ||
| "name": "kibanaSavedObjectMeta.searchSourceJSON.filter[1].meta.index", | ||
| "type": "index-pattern" | ||
| } |
There was a problem hiding this comment.
Duplicate of pattern at L1279.
| name: '{{ IngestPipeline "pipeline-account" }}' | ||
| tag: pipeline-account | ||
| if: >- | ||
| ctx.axonius?.identity?.asset_type.contains('accounts') |
There was a problem hiding this comment.
🟠 High ingest_pipeline/default.yml:605
The pipeline routing conditions on lines 605, 610, 615, and 620 use ctx.axonius?.identity?.asset_type.contains(...) which throws a NullPointerException when asset_type is null. The optional chaining only guards axonius and identity; once identity exists, the expression evaluates to null.contains(...) when asset_type is missing. This causes the entire pipeline to fail and fall through to on_failure, leaving documents incompletely processed. Consider adding null guards like ctx.axonius?.identity?.asset_type != null && ctx.axonius.identity.asset_type.contains('accounts') for all four conditions.
- ctx.axonius?.identity?.asset_type.contains('accounts')
+ ctx.axonius?.identity?.asset_type != null && ctx.axonius.identity.asset_type.contains('accounts')🤖 Copy this AI Prompt to have your agent fix this:
In file packages/axonius/data_stream/identity/elasticsearch/ingest_pipeline/default.yml around line 605:
The pipeline routing conditions on lines 605, 610, 615, and 620 use `ctx.axonius?.identity?.asset_type.contains(...)` which throws a `NullPointerException` when `asset_type` is null. The optional chaining only guards `axonius` and `identity`; once `identity` exists, the expression evaluates to `null.contains(...)` when `asset_type` is missing. This causes the entire pipeline to fail and fall through to `on_failure`, leaving documents incompletely processed. Consider adding null guards like `ctx.axonius?.identity?.asset_type != null && ctx.axonius.identity.asset_type.contains('accounts')` for all four conditions.
There was a problem hiding this comment.
This assessment is correct; there is no previous condition or action that guarantees that asset_type is not absent or null. An alternative fix is to use ctx.axonius?.identity?.asset_type?.contains('accounts') == true.
There was a problem hiding this comment.
Thank you for confirming the issue and suggesting the alternative approach. The ?.contains(...) == true pattern is indeed cleaner and more concise.
Would you like me to fix all four conditions (lines 605, 610, 615, and 620) using this approach?
There was a problem hiding this comment.
Leave all fixes for the author.
There was a problem hiding this comment.
Understood. I'll leave this for the author to fix.
There was a problem hiding this comment.
Keeping it as is.
There was a problem hiding this comment.
Sorry, I'm unable to act on this request because you do not have permissions within this repository.
| type: boolean | ||
| description: Indicates if auto-delete days for cloud recordings is configured. |
There was a problem hiding this comment.
🟡 Medium fields/fields.yml:971
axonius.identity.recording.auto_delete_cmr_days is defined as type: boolean, but the field name indicates it should store a numeric day count. When the Zoom API sends an integer like 30, the ingest pipeline's boolean conversion coerces it to true, silently dropping the actual value. Consider changing the type to long and removing this field from the boolean conversion script in pipeline-user.yml.
- name: auto_delete_cmr_days
- type: boolean
+ type: long🤖 Copy this AI Prompt to have your agent fix this:
In file packages/axonius/elasticsearch/transform/latest_identity/fields/fields.yml around lines 971-972:
`axonius.identity.recording.auto_delete_cmr_days` is defined as `type: boolean`, but the field name indicates it should store a numeric day count. When the Zoom API sends an integer like `30`, the ingest pipeline's boolean conversion coerces it to `true`, silently dropping the actual value. Consider changing the type to `long` and removing this field from the boolean conversion script in `pipeline-user.yml`.
There was a problem hiding this comment.
Do we have a test that exercises this? Is it documented somewhere? I don't see it in any tests.
There was a problem hiding this comment.
No tests currently exercise this field. I found it in the boolean conversion script at pipeline-user.yml line 775, where it's listed alongside other recording.* fields to be converted to boolean. The field is documented as type boolean with description "Indicates if auto-delete days for cloud recordings is configured." However, the field name auto_delete_cmr_days suggests it should store the actual day count (e.g., 30, 60, 90), not just a boolean flag—that's what the companion field auto_delete_cmr is for.
There was a problem hiding this comment.
The comment was actually directed to @muskan-agarwal26.
There was a problem hiding this comment.
yes, in live logs, we are getting value as true or false, hence it is kept boolean
There was a problem hiding this comment.
Sorry, I'm unable to act on this request because you do not have permissions within this repository.
There was a problem hiding this comment.
@muskan-agarwal26 Please make it so that this is exercised in our pipeline tests.
efd6
left a comment
There was a problem hiding this comment.
Please address https://github.com/elastic/integrations/pull/16620/changes#r3107948170 and the macroscopeapp concerns.
|
@efd6 I have addressed the comment: https://github.com/elastic/integrations/pull/16620/changes#r3107948170 , used script processors to reduce the count of processors. |
I don't see any change in the size of that file. |
It is actually reduced, it can be viewed in this commit: 8b75b00 |
|
@efd6 , please have a look at my comment: #16620 (comment) |
Proposed commit message
The release includes identity data stream and associated dashboard.
Axonius fields are mapped to their corresponding ECS fields where possible.
Test samples were derived from live data samples, which were subsequently
sanitized.
Checklist
changelog.ymlfile.How to test this PR locally
To test the axonius package:
Related issues
Screenshots